Clausal Coordinate Ellipsis in German: The TIGER Treebank as a Source of Evidence
Abstract
Syntactic parsers and generators need high-quality grammars of coordination and coordinate ellipsis, structures that occur very frequently but are much less well understood theoretically than many other domains of grammar. Modern grammars of coordinate ellipsis are based nearly exclusively on linguistic judgments (intuitions). The extent to which grammar rules based on this type of empirical evidence generate all and only the structures in text corpora is unknown. As part of a project on the development of a grammar and a generator for coordinate ellipsis in German, we undertook an extensive exploration of the TIGER treebank, a syntactically annotated corpus of about 50,000 newspaper sentences. We report (1) frequency data for the various patterns of coordinate ellipsis, and (2) several rarely (but regularly) occurring ‘fringe deviations’ from the intuition-based rules for several ellipsis types. This information can help improve parser and generator performance.
Related articles
Clausal Coordinate Ellipsis and its Varieties in Spoken German: A Study with the TüBa-D/S Treebank of the VERBMOBIL Corpus
Grammar rules for Clausal Coordinate Ellipsis (CCE) are based nearly exclusively on linguistic judgments (intuitions). For German, the extent to which grammar rules based on this type of empirical evidence generate all and only CCE structures that populate text corpora has only been explored with the TIGER treebank of written newspaper text. How well these rules fit spoken German is unknown. I...
Incremental sentence production inhibits clausal coordinate ellipsis: A comparison of spoken and written language
One of the benefits of incremental sentence production is to reduce the working memory capacity needed for advance planning: The planning units can be of considerably smaller size (measured in terms of word length) than in case of non-incremental production. The same advantage has been claimed for the various forms of ellipsis, which preempt the need to plan the detailed shape of one or more co...
Incremental sentence production and clausal coordinate ellipsis :
From two corpus studies into varieties of clausal coordination in English (Meyer, 1995 and Greenbaum & Nelson, 1999), it is known that the incidence of clausal coordinate ellipsis (CCE) is about two times higher in written than in spoken language. We present a treebank study into CCE in written and spoken Dutch and German which confirms this tendency. Moreover, we observe considerable differenc...
A Comparison of Clausal Coordinate Ellipsis in Estonian and German: Remarkably Similar Elision Rules Allow a Language-Independent Ellipsis-Generation Module
We compare the phenomena of clausal coordinate ellipsis in Estonian, a Finno-Ugric language, and German, an Indo-European language. The rules underlying these phenomena appear to be remarkably similar. Thus, the software module ELLEIPO, which was originally developed to generate clausal coordinate ellipsis in German and Dutch, works for Estonian as well. In order to extend ELLEIPO’s coverage to...
A Contrastive Study of Persian and English Written Discourse: Ellipsis in Realistic Novels
This study aspires to examine the concept of ellipsis by comparing and contrasting English and Persian written texts. For this purpose, three Persian novels and three English ones were selected. These novels were analyzed carefully; they were compared and contrasted for types and amount of ellipsis used, through a Chi-square analysis. The results of the data analysis revealed that various t...